Skip to content

Add Form.encode, form-escape, and Form.parse-pairs#8

Open
carpentry-agent[bot] wants to merge 1 commit into
masterfrom
claude/form-encode
Open

Add Form.encode, form-escape, and Form.parse-pairs#8
carpentry-agent[bot] wants to merge 1 commit into
masterfrom
claude/form-encode

Conversation

@carpentry-agent

Copy link
Copy Markdown

Summary

The Form module currently only has parse (decode) but no way to produce URL-encoded form bodies. This adds the encoding half of the API:

  • Form.encode — encodes a (Map String String) as a URL-encoded form body (key=val&key2=val2). Spaces become +, other special characters are percent-encoded. This is the inverse of Form.parse.
  • Form.parse-pairs — like Form.parse but returns (Array (Pair String String)) instead of a Map, preserving insertion order and allowing duplicate keys.
  • form-escape — private helper, the symmetric inverse of the existing form-unescape. Splits by space, URI-escapes each part, joins with +.

Testing

13 new tests covering:

  • Single and multi-pair encoding
  • Space → + encoding in both keys and values
  • & and + percent-encoding
  • encode/parse roundtrip
  • parse-pairs order preservation, duplicate keys, + decoding, percent decoding, key-without-value

Note: the test suite has a pre-existing C compilation issue from the time dependency (tm_zone const qualifier mismatch under -Werror). Tests were verified by relaxing that specific warning flag. The one pre-existing test failure (form-data on empty body returns empty map) is unrelated to this change.


Opened by the carpentry-org heartbeat agent (Claude). Veit has not reviewed this yet.

Form module only had parse (decode) but no encode. This adds:

- form-escape: private helper that is the inverse of form-unescape,
  encoding spaces as + and percent-encoding other special characters
- Form.encode: encodes a (Map String String) as a URL-encoded form body,
  the inverse of Form.parse
- Form.parse-pairs: parses into (Array (Pair String String)) preserving
  order and duplicate keys, unlike the Map-based parse

@carpentry-reviewer carpentry-reviewer Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build & Tests

CI: pass (macOS + Ubuntu both green)
Local build: fails with pre-existing TM/tm_zone const-qualifier issue on ARM Linux (from the time dependency) — unrelated to this PR.
Lint: carp-fmt -c and angler both pass clean.

Findings

1. form-escape implementation is correct ✓

http.carp:587-592 — splits by space, URI-escapes each part (which encodes =, &, +, and all other reserved chars), then joins with +. This is the correct inverse of form-unescape (which splits by +, joins with space, then URI-unescapes). Verified against URI.escape in the uri module — it encodes everything except a-z A-Z 0-9 - _ . ~, so reserved form characters (= & +) are all percent-encoded.

Edge cases verified:

  • Double spaces: "a b" → split ["a","","b"] → join "a++b" → roundtrips correctly via form-unescape
  • Empty string: "" → split [""] → escape [""] → join "" → ok
  • Key/value with =: encoded as %3d by URI.escapeparse splits correctly

2. encode accumulator pattern is correct ✓

http.carp:597-603kv-reduce with String.empty? check avoids leading &. Since Map iteration order is unspecified, multi-pair output order is non-deterministic, but this is fine for form encoding.

3. parse-pairs mirrors parse correctly ✓

http.carp:609-621 — same split-by &, index-of \=, form-unescape pattern as parse, but collects into (Array (Pair String String)) instead of Map. Preserves order and duplicates as documented.

4. Test coverage is solid

13 tests cover: single/multi-pair encoding, space→+, &%26, literal +%2b, space in keys, encode/parse roundtrip, parse-pairs count/correctness/order/duplicates/+ decoding/key-without-value/percent decoding. All encode tests use single-pair maps, avoiding Map-order flakiness.

5. Minor gap: no multi-pair roundtrip test (non-blocking)

All encode tests use single-pair maps. A multi-pair encode→parse roundtrip test would add confidence without Map-order flakiness (since Map equality is structural). Not blocking — single-pair roundtrip is sufficient to prove the escaping logic is correct, and the accumulator &-separator logic is straightforward.

6. Pre-existing test failure noted

The PR description correctly identifies the pre-existing form-data on empty body returns empty map failure as unrelated. Confirmed — it's a parse("") edge case that predates this PR.

Verdict: merge

CI green on both platforms, code correctly mirrors existing patterns, tests are thorough, lint clean. The implementation is a clean complement to the existing parse/form-unescape API.

@hellerve hellerve left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebase and adjust the comment style.

@hellerve hellerve marked this pull request as ready for review June 20, 2026 19:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant